Improved routing strategies for Internet traffic delivery 
We analyze different strategies aimed at optimizing routing policies in the Internet. We first
show that, for a simple deterministic algorithm, the local properties of the network strongly influence
the time needed for packet delivery between two arbitrarily chosen nodes. We then rely on a real
Internet map at the autonomous system level and introduce a score function that allows us to
examine different routing protocols and their efficiency in traffic handling and packet delivery. Our
results suggest that current mechanisms are not the most efficient and that they can be integrated
into a more general, though not too complex, scheme.
Modern society increasingly depends on large communication networks such as
the Internet. The need to spread information pervades our lives, and its
efficient handling and delivery is becoming one of the most important practical
problems. To this end, a suite of protocols for the dissemination of
information from a given source to thousands of users has been developed in the
last several years [1]. The Internet and other communication networks certainly
work reliably. However, both the physical network and the number of users are
growing continuously. The scalability of current protocols, as well as their
performance for larger system sizes and heavier loads on the network, are
critical issues that must be addressed in order to guarantee that networks keep
functioning in the near future. From this perspective, one of the fundamental
problems we face nowadays is to find optimal strategies for packet delivery
between a given sending node and its destination host.

The above problem demands taking into account at least two factors. The first
is related to the capabilities of routers and servers (mainly memory
requirements and CPU processing time) needed for the different algorithms to
operate efficiently. Second, it has recently been shown that the real
architecture of communication networks determines many of their properties
with respect to dynamical processes, such as their resilience to random
failures and attacks and the spreading of viruses and rumors. The latter has
been achieved in recent years by unraveling the complex patterns of
interconnections that characterize seemingly diverse systems such as the
Internet, the World Wide Web (WWW), and biological and social networks. It
turns out that most real networks can be described by growing models in which
the number of nodes (or elements) forming the network increases with time, and
in which the probability that a given node has $k$ connections to other nodes
follows a power law $P_k \sim k^{-\gamma}$. This is the case of the Internet,
which shows a scale-free (SF) connectivity distribution with an exponent that
has been estimated to be around $\gamma = 2.2$.





The aim of this paper is twofold. First, we show that the local properties of
the network on top of which a packet delivery process takes place determine
its efficiency. Then, we turn our attention to developing alternative
strategies for Internet traffic routing by implementing a general stochastic
protocol on top of a real Internet map at the autonomous system (AS) level.
Through Monte Carlo simulations we show that the current Internet protocol is
less efficient than one that combines knowledge of the topology and of the
traffic handled by the network at a local scale. Our results might provide new
hints for routing protocol design, since the proposed schemes are shown to
need roughly the same capabilities as current routers.

In order to discuss how the local topological properties influence the
efficiency of a given routing protocol, we use the network studied in
Ref. [11]. In this model, the network is generated following the
Barabási-Albert (BA) procedure, but introducing an affinity variable $f_i$ for
each node and a tolerance $\mu$, which together determine the peers $j$ to
which a new node $i$ can attach. This is done by requiring that
$f_j \in (f_i \pm \mu)$. The network shows the same global properties as the
BA network regardless of the tolerance. However, depending on the value of
$\mu$, other local properties, such as the clustering coefficient and
correlations, differ from those of the original BA network.
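
As a rough illustration of the construction just described, the sketch below grows a BA-like network in which a new node $i$ may only link to existing nodes $j$ with $|f_j - f_i| < \mu$. Several details are assumptions that the text does not specify: affinities are drawn uniformly from [0, 1), the seed is a small complete graph, preferential attachment is applied among the eligible nodes only (with weight degree + 1), and a node that finds fewer than `m` eligible peers simply creates fewer links. The function name `affinity_ba_graph` and its parameters are hypothetical.

```python
import random


def affinity_ba_graph(n, m, mu, seed=None):
    """Grow a BA-like graph with an affinity/tolerance constraint.

    Returns (adj, affinity): adj maps node -> set of neighbors and
    affinity[i] is the affinity f_i of node i. Illustrative sketch only.
    """
    rng = random.Random(seed)
    # Assumption: affinities are uniform in [0, 1).
    affinity = [rng.random() for _ in range(n)]
    adj = {i: set() for i in range(n)}

    # Assumption: the seed is a complete graph on m + 1 nodes,
    # built without applying the affinity constraint.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)

    for new in range(m + 1, n):
        # Only nodes whose affinity lies within the tolerance are eligible.
        eligible = [j for j in range(new)
                    if abs(affinity[j] - affinity[new]) < mu]
        targets = set()
        while eligible and len(targets) < m:
            # Preferential attachment among eligible nodes. The +1 keeps
            # isolated nodes selectable (assumption).
            weights = [len(adj[j]) + 1 for j in eligible]
            j = rng.choices(eligible, weights=weights, k=1)[0]
            targets.add(j)
            eligible.remove(j)
        for j in targets:
            adj[new].add(j)
            adj[j].add(new)
    return adj, affinity
```

With `mu = 1.0` every existing node is eligible, so the sketch reduces to plain preferential attachment, in line with the BA limit mentioned in the text. The quadratic cost of rebuilding the eligible list is acceptable for a demonstration but not for large networks.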

Once we have a network whose local properties can be tuned, we proceed to
define our basic routing protocol. In communication networks like the
Internet, routers deliver data packets by ensuring that all routers converge
to a best estimate of the path leading to each destination address. In other
words, routing follows the criterion of the shortest available path from a
given source to its destination. Inspired by this protocol, and with the aim
of comparing its performance with other strategies, we define the following
algorithm for packet delivery, which will be called the standard protocol. At
the beginning, $p$ packets are created, and both their destinations and their
sources are chosen at random. In subsequent time steps, each node $i$ holding
a packet sends it toward its destination $j$ along the shortest path between
$i$ and $j$, until all packets reach their destinations. That is, each packet
is forwarded in such a way that the distance $d_{ij}$, measured as the number
of nodes one needs to pass through between $i$ and $j$, is minimized. If there
is more than one such path, the choice is made at random. This recipe has
recently been studied in generic network models.

FIG. 1: $d\langle T_{max}\rangle/dp$ as a function of the network parameter
$\mu$. The figure illustrates the dependence of the standard routing protocol
on the local properties of the network. The inset shows the variation of $c$
with $\mu$. The size of the network is $N = 10^4$ nodes and $m_0 = m = 3$. The
degree distribution is a power law with exponent equal to 3. Note that the BA
limit corresponds to $\mu = 1$. See the text for further details.
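
The standard protocol described above can be sketched as a small simulation. This is a minimal illustration under assumptions the text leaves implicit: each node forwards at most one packet per time step from a first-in, first-out queue, all nodes act synchronously, and the graph is connected. The names `bfs_distances` and `simulate_standard` are hypothetical, and hop counts are used as the distance.

```python
import random
from collections import deque


def bfs_distances(adj, target):
    """Hop distance from every reachable node to `target`."""
    dist = {target: 0}
    frontier = deque([target])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                frontier.append(v)
    return dist


def simulate_standard(adj, p, seed=None):
    """Return T_max, the step at which the last of p packets arrives.

    Assumes a connected graph and one forwarded packet per node per step.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    queues = {u: deque() for u in nodes}
    dist_cache = {}

    # Create p packets with random, distinct source and destination.
    for _ in range(p):
        src, dst = rng.sample(nodes, 2)
        queues[src].append(dst)
        if dst not in dist_cache:
            dist_cache[dst] = bfs_distances(adj, dst)

    remaining, t = p, 0
    while remaining:
        t += 1
        moves = []
        # Synchronous update: decide all moves first, then apply them.
        for u in nodes:
            if queues[u]:
                dst = queues[u].popleft()
                d = dist_cache[dst]
                best = min(d[v] for v in adj[u])
                # Ties between equally short paths are broken at random.
                nxt = rng.choice([v for v in adj[u] if d[v] == best])
                moves.append((nxt, dst))
        for nxt, dst in moves:
            if nxt == dst:
                remaining -= 1
            else:
                queues[nxt].append(dst)
    return t
```

Averaging the returned value over many seeds for a fixed `p` gives an estimate of $\langle T_{max}\rangle$ under these assumptions.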

The above procedure is repeated many times for a number of processes (packets)
ranging from $p = 1$ to at least $p = 500$. Different realizations of the
dynamics for the same $p$ are performed in order to average the relevant
quantities. As a measure of the efficiency of the process, we have monitored
$\langle T_{max}\rangle$, the maximum time it takes for a packet to travel
from its source to its destination, averaged over different realizations [15].
The numerical results show that this quantity scales linearly with the number
of processes, so that its derivative is a proper order parameter to
characterize the routing performance. Figure 1 shows the slopes of the
straight lines as a function of the control parameter $\mu$, which determines
the local properties of the network. It is clear from the figure that the
algorithm's outcome depends on the topological details of the network. For
this network, the average shortest path length $L$ is roughly the same as that
of the BA network for values of $\mu$ down to about 0.2. However, in this
range of $\mu$ the slope $d\langle T_{max}\rangle/dp$ has a maximum and a
minimum. This implies that other properties are responsible for the observed
behavior, namely the clustering coefficient $c$. For the network under
analysis [11], it turns out that as $\mu$ is decreased from 1 (the BA limit)
to smaller values, $c$ first decreases slightly below the BA value (accounting
for the maximum) and then starts increasing up to values near those of real
networks (see the inset in Fig. 1). From this perspective, it is clear that
the routing process is controlled by $c$ in the region where $L$ is constant.
When $c$ increases, new connections appear among the neighbors of a node and
the number of shortest paths rises. Packets can then circumvent congested
nodes more easily, making the standard protocol more efficient. When $c$
decreases, the opposite occurs. For very small $\mu$, $L$ diverges, leading to
poor performance of the protocol, since the algorithm works on a
shortest-path-delivery basis. The crossover from the minimum to the divergence
of $d\langle T_{max}\rangle/dp$ takes place in the parameter region where the
interplay between $c$ and $L$ breaks down.
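
Since $\langle T_{max}\rangle$ is reported to grow linearly with $p$, the order parameter $d\langle T_{max}\rangle/dp$ can be estimated as the least-squares slope of the averaged data. The helper below is a minimal sketch of that fit. The function name `slope` is hypothetical and the numbers in the usage note are made up for illustration, not taken from the paper.

```python
def slope(ps, t_max_means):
    """Least-squares slope of <T_max> versus p.

    ps          -- sequence of packet numbers p
    t_max_means -- <T_max> averaged over realizations, one value per p
    """
    n = len(ps)
    mean_p = sum(ps) / n
    mean_t = sum(t_max_means) / n
    # Slope = covariance(p, T) / variance(p).
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(ps, t_max_means))
    var = sum((p - mean_p) ** 2 for p in ps)
    return cov / var
```

For example, `slope([1, 2, 3, 4], [3, 5, 7, 9])` evaluates to 2.0. Repeating the fit for each value of $\mu$ would give one point of a curve like the one described for Fig. 1.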

The preceding analysis shows that the routing protocol may be very sensitive
to local details of the network on top of which the spreading process takes
place. It is therefore advisable to use real networks in order to obtain
reliable results. To this end, we have used the Internet Autonomous System map
from the Oregon route server dated May 25, 2001 [16]. It is worth stressing
that each AS groups many routers together, and that the traffic carried by a
node is the aggregate of the traffic generated at its internal routers and of
the individual end-host flows between ASs.

The first modification of the routing mechanism is introduced by noting that
the standard procedure does not take into account the traffic on the network.
Specifically, a routing policy based on the shortest path between two given
nodes neglects the queues in overloaded nodes, which make the process slower
as the queue lengths become larger. That is, it may be more efficient to
divert a packet through a longer but less congested path. Let us hence assume
that a node $l$ is holding a packet that should be sent to a node $j$, and
define an effective distance $d^{i}_{eff}$ from a neighboring node $i$ of $l$
to the destination $j$ as $d^{i}_{eff} = d_i + c_i$, where $d_i$ is the
shortest path length between nodes $i$ and $j$ and $c_i$ is the number of
processes (or packets) in the queue of $i$. The above definition, however,
does not allow a direct comparison with the standard procedure. It is
convenient to redefine the effective distance as
$d^{i}_{eff} = h_d d_i + h_c c_i$, so that the limit $h_c = 0$ recovers the
standard protocol. Furthermore, without loss of generality, we assume
$h_d + h_c = 1$. This algorithm will be called the deterministic protocol
henceforth. The rest of the algorithm remains the same as before, i.e., at
each time step, the remaining packets are forwarded along the path that
minimizes $d^{i}_{eff}$.
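
The next-hop choice of the deterministic protocol can be illustrated with a short sketch. It assumes the sender already knows, for each neighbor $i$, the shortest path length $d_i$ to the destination and the queue length $c_i$. The function names, the dictionary-based interface, and the random tie-breaking are assumptions made for illustration.

```python
import random


def effective_distance(d_i, c_i, h_d):
    """d_eff = h_d * d_i + (1 - h_d) * c_i, using h_d + h_c = 1."""
    return h_d * d_i + (1.0 - h_d) * c_i


def next_hop(neighbors, dist, queue_len, h_d, rng=random):
    """Pick the neighbor with the smallest effective distance.

    dist[i]      -- shortest path length from neighbor i to the destination
    queue_len[i] -- number of packets queued at neighbor i
    Ties are broken at random (assumption, mirroring the standard protocol).
    """
    scores = {i: effective_distance(dist[i], queue_len[i], h_d)
              for i in neighbors}
    best = min(scores.values())
    candidates = [i for i in neighbors if scores[i] == best]
    return rng.choice(candidates)
```

As a worked case, take two neighbors: `a` with $d_a = 2$ and $c_a = 10$, and `b` with $d_b = 4$ and $c_b = 0$. For $h_d = 1$ the scores are 2 and 4, so the packet goes through `a`, as in the standard protocol. For $h_d = 0.75$ the scores are 4.0 and 3.0, so the packet is diverted through the longer but empty route via `b`.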

A first look at the dynamics shows that a protocol implemented in this way is
more efficient than one taking into account only the shortest path criterion.
In fact, $\langle T_{max}\rangle$ departs from the linear behavior previously
observed and stays well below the straight line up to high $p$. This behavior
clearly depends on $h_d$: if $h_d$ is zero, packets are diverted to the least
loaded node regardless of the path length, which results in an uncontrolled
increase in the distance traveled by the packets from the sending nodes.

[Figure 2 plot: $\langle T_{max}\rangle$ versus $p$ for $h_d$ = 1.0, 0.75, 0.50, 0.25.]

FIG. 2: (Color online) Dependence of $\langle T_{max}\rangle$ on the number of
initial packets $p$ in the deterministic limit of the model ($\beta = 20$),
run on top of an AS Internet map made up of around 11000 nodes. Each point is
an average over at least 200 realizations. The standard protocol corresponds
to the limit $h_d = 1$. Note that although the curves tend to cross the
straight line as $p$ increases, there is an optimal value of $h_d$ for which
the intersection would take place only in the limit of very heavy traffic.

The above algorithm can be further generalized by including a probabilistic
view. In other words, once we have determined $d^{i}_{eff}$ for all neighbors
$i$ of the sending node, we can allow for a stochastic choice of the paths.
Hence, our third algorithm, referred to as the stochastic protocol, considers
a score function or "energy" $H_i = h_d d_i + (1 - h_d) c_i$, and the
probability $\Pi_i$ that a packet is sent precisely through node $i$ is given
by

$$\Pi_i = \frac{e^{-\beta H_i}}{\sum_j e^{-\beta H_j}},$$

where $\beta$ is the inverse of the temperature. In the limit
$\beta \to \infty$ (zero temperature) we recover the deterministic protocol.
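
The stochastic choice can be sketched as a Boltzmann-like sampling over the neighbors of the sending node. The sketch assumes the normalization runs over those neighbors and takes the energies $H_i$ as already computed. The function names are hypothetical. Subtracting the smallest energy before exponentiating is only a numerical safeguard and leaves the probabilities unchanged.

```python
import math
import random


def hop_probabilities(energies, beta):
    """Return {neighbor: probability} with weights exp(-beta * H_i).

    energies -- dict mapping each neighbor i to its score H_i
    beta     -- inverse temperature (beta = 0 gives a uniform choice)
    """
    h_min = min(energies.values())
    # Shifting by h_min avoids overflow and cancels in the normalization.
    weights = {i: math.exp(-beta * (h - h_min)) for i, h in energies.items()}
    z = sum(weights.values())
    return {i: w / z for i, w in weights.items()}


def sample_hop(energies, beta, rng=random):
    """Draw one neighbor according to the Boltzmann probabilities."""
    probs = hop_probabilities(energies, beta)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[i] for i in nodes], k=1)[0]
```

With energies `{"a": 4.0, "b": 3.0}`, `beta = 0` assigns probability 1/2 to each neighbor, while `beta = 20` puts essentially all the weight on `b`, which illustrates how a large $\beta$ approaches the deterministic choice.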

Figure 2 shows the dependence of $\langle T_{max}\rangle$ on the number of
packets $p$ for several values of $h_d$ in the deterministic limit of the
model, which we found to be reached for $\beta = 20$. A dynamics that does not
take into account the amount of traffic handled by the neighbors of a sender
node (straight line in Fig. 2) performs worse than one that integrates both
ingredients. However, this depends on the specific weight of each metric in
$H_i$ and on $p$. In the regime where the traffic is not heavy (small $p$),
all curves lie below that of the standard protocol, but as the amount of
traffic handled by the network increases, the deterministic protocol starts
performing worse for a range of $h_d$ values. From the results, it seems that
eventually, when the traffic increases too much, the curves cross the straight
line, indicating that in that limit the shortest path strategy is best suited.
Note, however, that for $h_d = 0.75$ the convergence of the two algorithms
occurs only for a very heavy load. Consequently, we can assert that there is
an $h_d$ region where the combination of the two ingredients gives rise to the
best performance. On the other hand, the existence of an optimal $h_d$ value
distinct from zero can be understood by noting that a mechanism lacking some
degree of path length information between the source and destination nodes
performs badly: packets travel along paths so long that the extra distance is
not compensated by the time they would otherwise lose trapped in the queues of
congested nodes.

FIG. 3: (Color online) Dependence of $\langle T_{max}\rangle$ on the number of
packets $p$ for an intermediate value $\beta = 5$. The network parameters are
as in Fig. 2. In this case, the $h_d$ range in which the stochastic strategy
performs better than the standard one is reduced.

The completely stochastic limit of the model corresponds to $\beta = 0$. The
performance of the protocol in this limit is, however, very poor. In fact, at
infinite temperature all neighbors of a given sender have the same probability
of receiving the packet, and the dynamics becomes a random walk. With no
topological information about the destinations of the packets, they take much
longer to reach the receiver, and this algorithm is the worst. For
intermediate values of $\beta$, we have a stochastic dynamics in which
topological and traffic information coexist. This is the case depicted in
Fig. 3 for the same values of $h_d$ used in Fig. 2. As can be seen from the
figure, the stochastic protocol increases $\langle T_{max}\rangle$ by at least
one order of magnitude compared to the deterministic limit
($\beta \to \infty$). Moreover, the standard approach seems to be the best
choice for a wider range of $h_d$ values, although $h_d = 0.75$ still performs
better.

Figure 4 summarizes our results for different values of the control parameters
$\beta$ and $h_d$. The study of the whole phase diagram shows that the best
algorithm is one that includes information about both path lengths and
congestion at a local scale. Moreover, the deterministic limit with
$h_d = 0.75$ gives the best results for $\langle T_{max}\rangle$. It is worth
noting that, although Fig. 4 was obtained under not too heavy traffic
conditions, the results are consistent for larger values of $p$. Different
tests allow us to conclude that the optimal value is $h_d = 0.8 \pm 0.1$. In
any case, this confirms that it would be possible to devise more elaborate
protocols with the aim of reducing the time needed for a packet to travel
through the network. In light of the present results, such a strategy may be
implemented by also taking into account the amount of traffic handled by a
local area of the network.

FIG. 4: (Color online) Phase diagram of the system's dynamics. The network
parameters are as in Fig. 2. The number of processes is $p = 500$.
Calculations for higher $p$ show that the minimum of $\langle T_{max}\rangle$
is also attained around $h_d = 0.8 \pm 0.1$.

As suggested by Fig. 4, the best protocol is the deterministic one, which,
moreover, should be easier to implement in practice. The microscopic dynamics
of the routing process in this limit reveals that it is desirable for the
routing process to incorporate some knowledge of the nodes' queue lengths.
However, the second term of the score function should not carry excessive
weight. For small values of $h_d$, say 0.25, the algorithm performs better
than the standard one because the packets do not pass through the hubs of the
network, which are likely to lie on the shortest path to any node. Instead,
they go around the hubs and $\langle T_{max}\rangle$ is smaller. If $p$ is
increased, the neighbors of the hubs also become congested. This leads to a
situation in which the packets around a hub get trapped in its neighborhood,
moving in and out of it without finding a route to their destinations.

We finally point out that the improved strategies studied in this paper should
not be hard to implement in practice. Current protocols already use the
topological information required by the different variants explored. One would
only need to provide further information on the traffic status in a local
area. This could be done, for instance, by using the keep-alive messages that
routers continuously exchange with their peers, though in this case there is
some time delay due to the time scales of the routing process and of the
database or message exchanges.

In summary, we have studied alternative strategies for traffic delivery in
complex heterogeneous networks. The results show that the performance of the
standard approach is sensitive to local topological changes. Specifically, the
clustering properties may play a key role in message delivery. On the other
hand, algorithms that integrate topological and traffic information have been
shown to perform better than the standard protocol. Finally, we note that the
present results could be extended to the Internet map at the router level,
which is statistically equivalent to the AS map used here.